(57) Abstract: An automatic device, of particular utility for tourists, for capturing an image at a distance and translating text within the image, comprising: (a) photographing means for capturing an image containing a textual image, (b) character recognition means (56) for converting the textual image into corresponding text data written in a source language, and (c) translation means (58) for translating the text data from the source language to a target language text.

SIGN TRANSLATOR

FIELD AND BACKGROUND OF THE INVENTION 

The present invention relates to a system for automatic translation of 
text, and, more particularly, to a system for capturing text from a distance and
automatically providing a text translation into the desired language. 

Known in the art are various types of scanners that are capable of 
identifying text and translating the text into another language. The input data 
scanned are converted into system data and input text data using optical 
character recognition (OCR) programs, which are widely available and the
operation of which is well known in the field. OCR programs are often 
designed towards scanning and recognizing alphanumeric text. Some OCR 
programs are designed towards scanning and recognizing ideographic 
characters, such as Japanese kana.

An example of one popular commercial application is Quicktionary™,
a portable, handheld scanner that scans and automatically displays a
translation. The built-in OCR program is capable of reading and identifying 
multiple fonts and type sizes. The internal memory contains several hundred 
thousand words and idioms.

The scanning translators of the known art operate by pressing the paper
or surface bearing the text (or characters) against the scanning surface of the
scanner. In portable scanning translators, the entire scanner is brought near
to the text-bearing surface at which point the scanning is initiated.

Known translation systems are incapable, however, of reading and
translating text or characters from afar. This is a serious disadvantage, as
often the text-bearing surface is not close at hand and may be substantially
inaccessible. For example, millions of tourists each year travel in countries
in which the local alphabet, as used on traffic signs, street signs, and
advertisements, is incomprehensible to them. Although in theory it is possible
to carry hand-held portable translators and type in the words or sentences that
need translation, in practice such use is clumsy and time-consuming.
Moreover, hand-held portable translators often do not have the capability of
entering ideographic characters and forms belonging to foreign alphabets
such as Cyrillic or Hebrew. Finally, typographical errors or improper
identification of a foreign form or character by the tourist/user are highly
probable, and may often lead to mistranslation or to an inability to provide a
translation.

There is therefore a recognized need for, and it would be highly 
advantageous to have, a system that would be capable of reading a 
text-bearing and/or character-bearing object from afar and translating the text 
and/or characters into the requisite language. It would be of particular
advantage if such an invention would be economical, reliable, and convenient 
to use.

SUMMARY OF THE INVENTION

The present invention is an automatic device, of particular utility for 
tourists, for capturing an image at a distance and translating text within the 
image, comprising: (a) photographing means for capturing an image 
containing a textual image, (b) character recognition means for converting the
textual image into corresponding text data written in a source language, and 
(c) translation means for translating the text data from the source language to a 
target language text. 

According to further features in the described preferred embodiments, 
the target language text is provided as visual text.

According to still further features in the described preferred 
embodiments, the visual text is provided on a visual display. 

According to still further features in the described preferred 
embodiments, the target language text is provided audially.

According to still further features in the described preferred
embodiments, the photographing device is a digital video camera.

According to still further features in the described preferred 
embodiments, the photographing device is a digital snapshot (still-photo) 
camera. 

According to still further features in the described preferred
embodiments, the translation means for translating the text data from the
source language to a target language text is provided by accessing the Internet.

According to still further features in the described preferred
embodiments, the accessing of the Internet is performed using a portable
computer.

According to still further features in the described preferred 
embodiments, the accessing of the Internet is performed using a Personal 
Digital Assistant (PDA).

According to still further features in the described preferred 
embodiments, the accessing of the Internet is performed using a mobile phone. 

According to still further features in the described preferred 
embodiments, the visual display is a visual display of a mobile phone.

According to still further features in the described preferred
embodiments, the automatic device of the present invention further comprises
(d) at least one feature selected from the group consisting of: tour books, 
business information, language-learning means, maps, and travel directions. 

According to still further features in the described preferred 
embodiments, the automatic device of the present invention further comprises
(d) user-identification means. 

According to still further features in the described preferred 
embodiments, the automatic device of the present invention further comprises 
(d) at least one feature selected from the group consisting of: credit cards, 
travel tickets, membership cards, car-rental contracts, hotel reservations, and
entertainment tickets. 

According to still further features in the described preferred
embodiments, the automatic device of the present invention further comprises
(d) an audio device for recording voice data in the source language, and (e) 
translation means for translating the voice data from the source language to the 
target language text. 

According to still further features in the described preferred
embodiments, the target language text is provided on a visual display or
audially. 

The present invention successfully addresses the shortcomings of the 
existing technologies by providing a device capable of instantly capturing an 
image viewed from a distance, identifying the text therein, and translating the
text to the target language. 



BRIEF DESCRIPTION OF THE DRAWINGS 

The invention is herein described, by way of example only, with
reference to the accompanying drawings. With specific reference now to the
drawings in detail, it is stressed that the particulars shown are by way of
example and for purposes of illustrative discussion of the preferred
embodiments of the present invention only, and are presented in the cause of
providing what is believed to be the most useful and readily understood
description of the principles and conceptual aspects of the invention. In this
regard, no attempt is made to show structural details of the invention in more
detail than is necessary for a fundamental understanding of the invention, the
description taken with the drawings making apparent to those skilled in the
art how the several forms of the invention may be embodied in practice.

In the drawings:

FIG. 1a is a schematic illustration of the exterior of a device according
to the present invention, integrated into a digital video camera.

FIG. 1b is a schematic illustration of the inner workings of the device
of FIG. 1a.

FIG. 2 illustrates the camera display of the system of FIG. 1.

DESCRIPTION OF THE PREFERRED EMBODIMENTS

The present invention is an automatic system that captures text, usually
at a distance, identifies the text and translates the text into the requisite
language.

The principles and operation of the device according to the present 
invention may be better understood with reference to the drawings and the
accompanying description.

Before explaining at least one embodiment of the invention in detail, it 
is to be understood that the invention is not limited in its application to the 
details of construction and the arrangement of the components set forth in the 
following description or illustrated in the drawings. The invention is capable
of other embodiments or of being practiced or carried out in various ways.
Also, it is to be understood that the phraseology and terminology employed 
herein is for the purpose of description and should not be regarded as 
limiting. 



WO 01/04790 PCT/IL00/00399 

Referring now to the drawings, Figure 1a is a schematic illustration of
the exterior of a device according to the present invention, integrated into a
conventional digital video camera 10. The digital video camera 10 has two
main operating buttons: a first operating button 12 for filming, and a second
operating button 14 for the purpose of the present invention. When the
second operating button 14 is pressed, the text is captured, identified, and
translated into the requisite (target) language.

Figure 1b is a schematic illustration of the inner workings of the device
of Figure 1a, integrated into a conventional digital video camera 10. The lens
50, which represents the optics of the device, focusses a text-containing
image (not shown) onto a CCD (Charge Coupled Device) array 52. The
voltage of each element of the CCD array corresponds to the light intensity
focussed on the element. The image capturing unit 54 reads the voltage of
each element in the array, yielding a field of pixels.
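
By way of non-limiting illustration only, the readout performed by the image capturing unit 54 may be modelled as a linear quantization of each element voltage into a pixel intensity. The function name `voltages_to_pixels`, the full-scale voltage `v_max`, and the 8-bit pixel depth are assumptions made for this sketch and are not specified in the present description.

```python
from typing import List


def voltages_to_pixels(voltages: List[List[float]], v_max: float = 1.0) -> List[List[int]]:
    """Map a 2-D grid of CCD element voltages to 8-bit pixel intensities.

    Assumes (hypothetically) a linear response between 0 V and v_max.
    """
    if v_max <= 0:
        raise ValueError("v_max must be positive")
    pixels = []
    for row in voltages:
        pixel_row = []
        for v in row:
            # Clamp to the valid range, then scale linearly to 0..255.
            clamped = min(max(v, 0.0), v_max)
            pixel_row.append(round(clamped / v_max * 255))
        pixels.append(pixel_row)
    return pixels
```

Voltages outside the assumed range are clamped rather than rejected, so that a saturated element simply yields a fully white pixel.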

The text within the captured image from the image capturing unit 54 is
passed on to the character recognition unit 56, which identifies the text.
Subsequently, the identified text is passed on to the translating unit 58, which
translates the source text into the target language. The translated (target
language) text is then provided visually on a display 20. Alternatively, the
translated text is passed on to a speech synthesizer 62 which converts the text
into audible speech, which is then presented to the user by means of a speaker
64.
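
By way of non-limiting illustration only, the flow from recognition through translation to visual or audible output may be sketched as a chain of interchangeable stages. The class name `SignTranslator` and the callable signatures are hypothetical; each callable merely stands in for the correspondingly numbered unit and does not represent any particular OCR, translation, or speech-synthesis product.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class SignTranslator:
    """Chains the stages described above; each stage is a pluggable callable."""

    recognize: Callable[[bytes], str]      # stands in for character recognition unit 56
    translate: Callable[[str, str], str]   # stands in for translating unit 58
    show: Callable[[str], None]            # stands in for display 20
    speak: Optional[Callable[[str], None]] = None  # stands in for synthesizer 62 and speaker 64

    def process(self, image: bytes, target_language: str, audible: bool = False) -> str:
        """Recognize the text in the image, translate it, and output it."""
        source_text = self.recognize(image)
        target_text = self.translate(source_text, target_language)
        if audible and self.speak is not None:
            # Audible output path.
            self.speak(target_text)
        else:
            # Visual output path (also used when no speech stage is fitted).
            self.show(target_text)
        return target_text
```

Keeping each stage as a separate callable reflects the statement that existing recognition and translation technologies may be integrated; any stage can be exchanged without altering the others.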

It should be emphasized that the system of the present invention can
utilize and integrate existing technologies such as OCR, digital video
equipment, and computerized translators such as Quicktionary™. While
some OCR technologies require that the source language be identified by the
user, others are capable of identifying the source language automatically. The
present invention can utilize both of these OCR technologies, as well as other
kinds of character recognition technologies. In any event, the target language
can be fixed in the device as a dedicated target language, loaded as a sole
target language by the user, or loaded as one of several target languages by the
user (in which case, the user must specify which target language is desired).
All of these translating options are possible using translating components of
the prior art.
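
By way of non-limiting illustration only, the selection among a dedicated target language, a sole loaded target language, and one of several loaded target languages may be sketched as follows. The function name `resolve_target_language` and its parameters are hypothetical, and the language codes in the usage note are merely examples.

```python
from typing import Optional, Sequence


def resolve_target_language(loaded: Sequence[str], requested: Optional[str] = None) -> str:
    """Pick the target language from those available on the device.

    One language available (dedicated, or the sole loaded language): use it,
    ignoring any request. Several loaded: the user must name one of them.
    """
    if not loaded:
        raise ValueError("no target language is available")
    if len(loaded) == 1:
        return loaded[0]
    if requested is None:
        raise ValueError("several target languages are loaded; one must be specified")
    if requested not in loaded:
        raise ValueError(f"target language {requested!r} is not loaded")
    return requested
```

For example, `resolve_target_language(["he"])` returns `"he"` without any user choice, whereas `resolve_target_language(["he", "en"])` raises an error until the user specifies one of the two.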

Figure 2 illustrates the camera display 20 of the system of Figure 1.
Within the camera display 20, a sign 22 having the text 24 "MARKET" is
shown. The sign 22 containing text 24 lies within the field of translation 26.
Upon pressing the "translate" button (the second operating button 14 of
Figure 1a), the text is translated into the desired target language, in this case,
Hebrew ("שוק"). The translated text 28 is then provided at the bottom 30 of
the camera display 20.

In a preferred embodiment, the field of translation 26 is a sub-frame of
the camera display 20, preferably located near the center of the camera
display 20. In another preferred embodiment, the field of translation 26 can
be adjusted according to the size of the text 24.
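
By way of non-limiting illustration only, a centered and adjustable field of translation may be computed as a rectangle within the display. The function name `centered_field`, the `fraction` parameter, and the pixel coordinate convention (origin at the top-left corner) are assumptions made for this sketch.

```python
from typing import Tuple


def centered_field(display_w: int, display_h: int, fraction: float = 0.5) -> Tuple[int, int, int, int]:
    """Return (left, top, width, height) of a sub-frame centered in the display.

    fraction is the share of each display dimension covered by the sub-frame;
    enlarging it adjusts the field of translation to larger text.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    width = int(display_w * fraction)
    height = int(display_h * fraction)
    left = (display_w - width) // 2
    top = (display_h - height) // 2
    return left, top, width, height
```

On a hypothetical 640 x 480 display, `centered_field(640, 480, 0.5)` yields a 320 x 240 sub-frame whose top-left corner lies at (160, 120).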

In the figures and accompanying description above, the device
according to the present invention is integrated in a digital camera, such as a
digital video camera or a digital snapshot (still-photo) camera. Thus, the
inventive features are built into the digital camera. In principle, such a
system can include any equipment used to read or obtain visual information
at a distance, e.g., digital binoculars or surveying instruments of various
kinds.

The present invention may also be an add-on unit to an existing digital
camera, i.e., a camera that was manufactured without having the present
invention in mind. In connecting the device to an existing camera unit, a
short cable having an appropriate fitting is used, the fitting being connected
to an IEEE 1394 outlet or similar outlet. Such outlets are typically used
today to connect a digital video camera to a computer. Connection to a
digital camera is typically effected via existing-type communication ports
such as USB.

In add-on systems of this type, wherein the device of the present 
invention is hooked up to existing equipment (compatible, but not embedded 
or built-in), the inventive device typically includes a display screen, 
keyboard (having one or more keys or buttons), and computer
communication cable.

In another kind of add-on unit, the device according to the present 
invention, having a dedicated element for capturing images containing text, is
added on to a mobile phone or to a Personal Digital Assistant (PDA) such as
Palm Pilot™. 

Alternatively, the present invention may be a stand-alone unit, having a 
dedicated element for capturing images containing text instead of being built 
into a digital camera.

It must be emphasized that tourists are accustomed to traveling with 
cameras of various kinds. Because the device according to the present 
invention can be affixed to cameras and the like, and can even be built as an 
integral part of the camera, the portable text-capturing and text-translating 
device is particularly easy and convenient to use.

The dictionary or dictionaries providing the translation can be stored 
physically within the device. In a preferred embodiment, the dictionary or 
dictionaries providing the translation can be accessed via the Internet. This is 
particularly relevant for applications in which the equipment does not 
necessarily have considerable computing power, as with many cellular
phones. 
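
By way of non-limiting illustration only, a dictionary stored within the device may be combined with a dictionary accessed via the Internet. Consulting the local dictionary first and the remote dictionary second is a design choice of this sketch and is not prescribed by the present description; `look_up` and `remote_lookup` are hypothetical names, and `remote_lookup` stands in for an Internet dictionary service without assuming any real network API.

```python
from typing import Callable, Dict, Optional


def look_up(word: str,
            local_dictionary: Dict[str, str],
            remote_lookup: Optional[Callable[[str], Optional[str]]] = None) -> Optional[str]:
    """Translate one word, preferring the on-device dictionary.

    Returns None when neither the local nor the remote dictionary has an entry.
    """
    key = word.strip().lower()
    if key in local_dictionary:
        return local_dictionary[key]
    if remote_lookup is not None:
        # Hypothetical stand-in for a dictionary accessed via the Internet.
        return remote_lookup(key)
    return None
```

A device with little storage could pass an empty local dictionary, so that every word is resolved remotely.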

As used herein in the specification and in the claims section that 
follows, the term "text" includes sentences, phrases, words, letters and 
ideographic signs. The letters include, by way of example, Braille letters, 
and the ideographic signs include, by way of example, traffic signs and the
like. 

As used herein in the specification and in the claims section that 
follows, the term "translate" refers to any of the following definitions taken
from "The American Heritage College Dictionary" (Houghton Mifflin Co.,
U.S.A., Third Edition, 1997):

1. To render in another language;
2a. To put into simpler terms; explain;
2b. To express in different words; paraphrase.

The above-mentioned definitions of the term "translate" refer also to 
ideographic signs, such as Japanese kana, and to traffic signs and the like. 
The term "translate" also refers to phonetic translation, i.e., a phonetic 
representation in the source language or in the target language. 

As used herein in the specification and in the claims section that
follows, the term "photographing means" refers to means that are capable of
capturing an image at a distance, and that are characteristically capable of
capturing the entire image in an instantaneous or all-at-once fashion.
Photographing means of the present invention include all kinds of digital
means of capturing an image, including digital cameras (such as digital video
cameras and digital snapshot cameras), and digital binoculars. Specifically
excluded from photographing means are scanners, in which the
image-recording is performed in a gradual fashion, and which require the
object bearing the image to be substantially in contact with the scanning
mechanism.

According to still further features in the described preferred
embodiments, the automatic device of the present invention further comprises
an audio device for capturing voice data in the source language, and translation
means for translating the voice data from the source language to the target
language text. Referring to Figure 1b, the audio input is conveyed via
microphone 80 to an audio signal digitizing unit 82. The signals are
recognized as words in the recognition unit 84. The words identified in the
recognition unit 84 are subsequently translated into the target language by
translation unit 86. The translated (target language) text is then provided
visually on a display 20. Alternatively, the translated text is passed on to a
speech synthesizer 62 which converts the text into audible speech, which is
then presented to the user by means of a speaker 64.

Although the invention has been described in conjunction with specific 
embodiments thereof, it is evident that many alternatives, modifications and 
variations will be apparent to those skilled in the art. Accordingly, it is 
intended to embrace all such alternatives, modifications and variations that 
fall within the spirit and broad scope of the appended claims.






WHAT IS CLAIMED IS: 

1. An automatic device for capturing an image at a distance and 
translating text within the image comprising: 

(a) photographing means for capturing an image containing a textual 
image; 

(b) character recognition means for converting said textual image 
into corresponding text data written in a source language; 

(c) translation means for translating said text data from said source 
language to a target language text. 

2. The device of claim 1, wherein said target language text is 
provided as visual text. 

3. The device of claim 2, wherein said visual text is provided on a
visual display. 

4. The device of claim 1, wherein said target language text is 
provided audially. 

5. The device of claim 1, wherein said photographing device is a 
digital video camera. 

6. The device of claim 1, wherein said photographing device is a
digital snapshot (still-photo) camera.

7. The device of claim 1, wherein said translation means for
translating said text data from said source language to a target language text is 
provided by accessing the Internet. 

8. The device of claim 7, wherein said accessing the Internet is 
performed using a Personal Digital Assistant (PDA). 

9. The device of claim 7, wherein said accessing the Internet is 
performed using a portable computer. 

10. The device of claim 7, wherein said accessing the Internet is 
performed using a mobile phone. 

11. The device of claim 3, wherein said visual display is a visual 
display of a mobile phone. 

12. The device of claim 1, further comprising: 

(d) at least one feature selected from the group consisting of tour 
book, business information, language-learning means, map, and travel
directions. 

13. The device of claim 1, further comprising: 

(d) user-identification means. 

14. The device of claim 1, further comprising:

(d) at least one feature selected from the group consisting of 
credit card, travel ticket, membership card, car-rental contract, hotel 
reservation, and entertainment ticket. 

15. The device of claim 1, further comprising:

(d) an audio device for capturing voice data in said source language, 
and 

(e) translation means for translating said voice data from said source 
language to said target language text. 

16. The device of claim 15, wherein said target language text is 
provided on a visual display. 

17. The device of claim 15, wherein said target language text is 
provided audially. 